Abstract

Spreading Activation and Lateral Inhibition are highly parallel
information processing techniques which have been used with some success
in simulating human cognitive faculties on computers. Unfortunately,
these simulations run like molasses on serial machines. This paper
describes the design of a VLSI-based architecture for the parallel
simulation of activation and inhibition.

1. Introduction

Spreading Activation and Lateral Inhibition are highly
parallel information processing techniques which have been used with
some success in simulating human cognitive faculties on computers.

Work on computer vision involving "relaxation" techniques[1], research
in visual perception of letters and words[2], simulations of motor
control[3], and research into the understanding of human memory[4] all
use notions of activation and inhibition. The author's main research
interest is in how these concepts are useful for Natural Language
Processing.[5]

Activation  and  inhibition  are  highly  parallel  and  local  iterative 
processes  which  are  applied  to  weighted  networks.  While  they  can  be 
(and  always  have  been)  simulated  on  serial  computers,  where  the  compute 
time  per  iteration  is  linear  with  respect  to  the  total  number  of  links 
in  the  network,  they  also  could  easily  be  simulated  on  parallel 
machines,  where  the  computation  time  per  iteration  would  be  constant. 

We have completed most of the preliminary design and NMOS layout of
a parallel machine for the simulation of an activation and inhibition
network. Each node in a network maps into a single VLSI cell[6] which
has storage for its activation level, its links to other nodes, and
messages (i.e. contributions) to be sent. Each cell is physically
connected to only two other cells; therefore, indirect messages are
forwarded. The machine works in a cyclic fashion: every cycle, each
cell generates a bunch of messages and sends them all out, then updates
its activation level based on the messages it received. A block diagram
is shown in Figure 1.


[1] See, for example, [Waltz, 1973] or [Hinton, 1977].

[2] [McClelland and Rumelhart, 1981] demonstrate how an
activation/inhibition model can account for data from many studies of
human perception of letters in context.

[3] [Rumelhart and Norman, 1982] have an activation-based model of a
typist, complete with errors.

[4] Minsky's paper on K-lines [Minsky, 1980] speculated that computation
in the human brain may be organised as a society of interacting local
agents, whose influence on each other occurs through activation and
inhibition.

[5] See AABG Working Paper 35 for details.

[6] A current controversy in the field of parallel associative
memories is whether there really should be a one to one mapping between
nodes in an associative network and nodes in a physical network.
Hinton [1981] gives some compelling arguments why there should not be.
However, while all involved in the controversy agree that fully
distributed memory is a powerful concept, no one has found a way of
doing activation and inhibition on it, so I'll stick to the simple
mapping.

1.1. BACKGROUND

The activation/inhibition network we are working with is a
weighted and labeled directed graph, where the nodes represent concepts
and the links represent binary relations between concepts. Node
weights, W_i(t), represent activation levels (a measure of relevance),
and link weights, L_{ij}, represent strength of activation (if positive)
or of inhibition (if negative). The processes of spreading activation
and lateral inhibition involve the iterative recomputation of the
activation level for each node based on its weighted connections. At
each cycle t, every node receives a contribution from each of its
neighboring nodes equivalent to the neighbor's activation level
multiplied by the weight of the intervening link:

    \[ C_i(t) = \sum_j L_{ij} W_j(t) \]

This contribution (scaled to range between -1 and 1) causes a
proportional change in the activation level of the node for the next
iteration:

    \[ W_i(t+1) = W_i(t) + \max(C_i(t), 0) (M - W_i(t)) + \min(C_i(t), 0) (W_i(t) - m) \]

So a contribution of 1 maps the node up to its maximum activation level,
M, while a contribution of -1 maps the node down to its minimum, m.
Eventually, a static condition is reached where some nodes reach their
maximum or minimum strength, while the rest of them receive
contributions of 0, when the positive and negative contributions
balance.
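
As an illustration of the two equations above, the following Python
sketch performs one synchronous update on a small network. The
dictionary-of-links representation, the function names, the default
bounds m = 0 and M = 1, and the use of clipping to keep the contribution
within [-1, 1] are assumptions made for the example; the text says only
that the contribution is scaled to that range.

```python
def contributions(links, levels):
    """C_i(t) = sum over j of L_ij * W_j(t).

    links[i] is a dict mapping neighbor index j to the link weight L_ij
    (a hypothetical layout chosen for this sketch).
    """
    return [sum(w * levels[j] for j, w in row.items()) for row in links]


def clamp(x, lo=-1.0, hi=1.0):
    """Assumed scaling: clip the contribution into [-1, 1]."""
    return max(lo, min(hi, x))


def step(links, levels, m=0.0, M=1.0):
    """One activation cycle: every node updates from the old levels."""
    new_levels = []
    for w_i, c in zip(levels, contributions(links, levels)):
        c = clamp(c)
        # Positive contributions pull toward M, negative ones toward m.
        new_levels.append(w_i + max(c, 0.0) * (M - w_i) + min(c, 0.0) * (w_i - m))
    return new_levels


# Node 0 is excited by node 1; node 1 is inhibited by node 0.
links = [{1: 1.0}, {0: -1.0}]
print(step(links, [0.5, 1.0]))
```

A contribution of exactly 1 moves node 0 to M in a single step, and the
contribution of -0.5 to node 1 closes half of its distance to m, which
matches the proportional behavior described in the text.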

Serious use of activation/inhibition networks could require
thousands of nodes and hundreds of such cycles. On a serial machine,
execution will be very slow due to the multiple access of a very
large memory; so slow, in fact, that cognitive researchers usually
compromise their theories just to get acceptable run-times, thereby
eliminating possibly important effects.

If an activation network were implemented in hardware, it would
provide a vehicle for this type of research with real-time response,
and could very well stimulate research in this area.

There are many problems in hardware, however, which do not arise in
software. The main costs in highly concurrent systems are connection
and communication, not computation. Indeed, in a parallel simulation
of an activation/inhibition network, every activation cycle will
require a full barrage of messages to be sent, but relatively little
computation.

The first problem is that of physical connectivity. It is well
known that simple, regular interconnection schemes (i.e. tesselations
of the plane, trees, etc.) are more feasible and less expensive than
massively parallel (i.e. crossbar) interconnections[7]. Assuming a
one node to one VLSI cell correspondence, there are many
interconnections (i.e. links) which need to be made. Given that a
simple and local physical interconnection scheme is used, the system
must still support a large logical interconnection, i.e. processors must
be able to send messages to non-adjacent processors. Logical
interconnection must also deal effectively with message traffic to
avoid bottlenecks and deadlock conditions.

A third interconnection problem is at the programming level:
how to specify a subset of the logical interconnection to
be used and how to reconfigure it.

Finally, the problem of central control is very important:
Given a system which could exhibit a massive degree of parallelism, it
would be inefficient (not to mention inelegant) to put it at the
service of a central serial processor[8]. Related to the problem of
central control is the problem of addressing: Are there any alternatives
to fixed locations in a fixed address space?

1.2. A Satisficing[9] Design

The design for an activation/inhibition network cell described
below provides satisfactory solutions to some of the aforementioned
problems and sufficient solutions to others. There are 3 design
parameters: α which is the address width, β which is the bit width for
arithmetic, and ρ which is the size of message queues. Using the
simplest physical interconnectivity (linear adjacency) it can achieve
a large logical interconnectivity (each processor can talk to 2^(α+1)
others) and get a full set of messages sent without possibility of
saturation or deadlock in constant[10] time. Furthermore, the
connections may be programmed quite easily, the addressing is
fixed-width but relative to each cell, there is no central control, and
the amount of real-estate is proportional to ρ(α+β).

This performance is achievable because the messages in an
activation/inhibition network (i.e. neighborly contributions) are
ultimately to be summed; so they may be summed as they are forwarded.
If each node sends and receives two messages per cycle, and the
incoming messages are merged with queued messages having the same
destination, the number of messages queued cannot increase.


[7] Sutherland and Mead [1977] discuss the importance of having simple
and regular geometries in VLSI.

[8] For example, Fahlman's [1979] design for a set-intersection
machine is hampered by its connection to a central processor via a very
overloaded bus.

[9] Satisfying + Sufficing = Satisficing, a term coined by Herbert
Simon in his classic "The Sciences of the Artificial" [Cambridge: MIT
Press, 1969].

[10] In this design, the time is O(2^α). The best achievable message
throughput would be O(α), but this cannot be done with a simple, regular
interconnection scheme.

Furthermore, if each node sends out its longest message first, then the
length of the longest message is always decreasing. Since the maximum
distance is 2^α - 1, it is also the maximum number of cycles to complete
a full activation cycle.
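
The merging and longest-first argument can be checked with a small
software model. The sketch below follows only the messages travelling to
the right along a linear array; each cell holds a table from relative
distance to summed contribution, sends its longest message each cycle,
and merges arrivals with queued messages for the same destination. The
data layout and the name run_rightward are hypothetical, and every
destination is assumed to lie inside the array.

```python
def run_rightward(queues):
    """Forward rightward messages until all are delivered.

    queues[i] maps a relative distance d >= 1 to the summed contribution
    waiting at cell i for the cell d places to its right.
    Returns (inbox, cycles), where inbox[i] is the total delivered to
    cell i. The queues are modified in place.
    """
    n = len(queues)
    inbox = [0.0] * n
    cycles = 0
    while any(queues):
        # Every cell picks its longest message at the same time.
        in_flight = [None] * n
        for i, q in enumerate(queues):
            if q:
                d = max(q)
                # The relative address is decremented as it is sent.
                in_flight[i] = (d - 1, q.pop(d))
        # Every right-hand neighbor then receives and merges.
        for i, msg in enumerate(in_flight):
            if msg is None:
                continue
            d, value = msg
            if d == 0:
                inbox[i + 1] += value  # arrived at its destination
            else:
                queues[i + 1][d] = queues[i + 1].get(d, 0.0) + value
        cycles += 1
    return inbox, cycles


queues = [{3: 1.0, 1: 2.0}, {3: 16.0, 2: 4.0}, {1: 8.0}, {}, {}]
print(run_rightward(queues))  # ([0.0, 2.0, 0.0, 13.0, 16.0], 3)
```

In this run the message from cell 0 to cell 3 merges with the one queued
at cell 1 for the same destination, and all traffic is delivered in 3
cycles, the length of the longest initial message.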

The system works with two two-phase, non-overlapping clocks.
One of the clocks, φ (on the order of the gate delay for a β-bit
carry-propagate adder), is used for the internal clocking of the cells.
The other, LOAD, has a short phase (O(ρβ)) used for loading the message
queues and a long phase (O(2^α)) used for allowing the messages to
completely propagate. Note that the problem of clock skew is minimal
because of cell adjacency.

The basic message passing behavior of the cell is two-phase. In
phase one of φ, messages are passed left while the next message going
right is selected; in phase two, messages are passed to the right while
the next message going left is selected. The message queues are dynamic
sorting arrays in which alternating pairs of messages are sorted each
alternating phase of φ.

An NMOS layout was performed for this cell in which the design
parameters were α=4, β=4 and ρ=8. The full layout is quite large (in
order to see detail), so a condensed and outlined version is shown in
Figure 2.

2. Description of Cell and Components

The cell is composed of two equal parts, one which handles
messages going left to right, and the other which handles messages going
right to left. Each of these sub-processors is composed of three
sections: the input/load section, the update section and the
communication section.

The input/load section is responsible for loading the programmed
connections (i.e. weighted links) into the communication section
and accepting new links (from an unspecified external agent).
While in the not-LOAD phase, new links may be presented and will be
integrated into the (sorted) link memory. In the LOAD phase, the
links are both passed into the update section to be multiplied by
the current activation level and fed back around through ρ delay
stages into the sorted memory.

The update section is responsible for maintaining the activation
level of the cell based on the current level and on the received
messages which filter up from the communication section. This
portion, which should be a microcoded arithmetic unit, has not been
designed yet.

The communication section is responsible for message passing and
merging. It contains a sorting memory which holds messages to go
out, filters incoming messages to the top (which is input to the
update section), and merges messages to be forwarded.

2.1. Input/Load Section

The main components of the input/load section are the
input/feedback selector, the delay stages, the sorting array,
and the load/hold selector. The input/feedback selector is a
simple 1-of-2 selector based on the LOAD signal; if LOAD is true, it
allows the (feedback) data from the bottom of the sorting array into
the delay stages; if LOAD is false, it allows the input (new links)
into the delay stages. The delay stages are simple dynamic registers
which are clocked in both phases of φ. The load/hold selector
controls the input to the multiplier and feedback; when LOAD is true,
it feeds the bottom of the sort array back into itself and sends a 0
into the multiplier; when LOAD is false, it passes the output of the
sort array into the multiplier (and feedback loop) and a 0 back into
the sort array. The layouts for the two types of selector and the
delay stage are shown in Figure 3.

2.1.1. The Sort/Merge Array

The sort/merge array (or sorting memory) is the principal
component in the activation/inhibition network cell. Its goal is to
maintain a list of address:weight pairs in sorted order, and it is
used in several ways. In the input/load section of the cell, it is
used as a merging memory for programmed links, and as a shift
register which shifts out the longest (i.e. maximal relative address)
link first. In the communication section, it is used as a message queue
which filters up input messages, and sorts and merges output messages to
minimize message delay and traffic.

It is essentially a very fancy up/down shift register which is
composed bit-wise of α compare/swap cells and β add/swap cells
sandwiched between dynamic registers. During alternate phases of the φ
clock, each register is compared alternately with the register above
it and the one below it to see if they need merging or swapping.
This is accomplished with two signals, ADD and SWAP, which are
computed from the address portion of the registers.

Each of the four combinations of ADD and SWAP has a meaning as
follows, where A and B are the upper and lower registers being
compared, and it is desired that messages with the same destination get
added, but that incoming messages filter up.

    ADD   SWAP   MEANING
     0     0     (A < B)
     0     1     (A > B)
     1     0     (A = B) (not 0)
     1     1     (A = B = 0)

These two signals are computed with NOR gates distributed through
the compare/swap cell, and then are used to select outputs for the
comparator cell and a corresponding adder cell in the following
manner:

                  COMPARATOR        ADDER
    ADD   SWAP    Aout   Bout    Aout   Bout
     0     0       A      B       A      B
     0     1       B      A       B      A
     1     0       0      B       0     A+B
     1     1       B      0      A+B     0

A demonstration of the use of the array as a sorter is given in the
table below, where messages are represented as address:data and the
pairs of messages being compared are boxed.


Note that besides the message with the greatest address (5:4)
moving to the bottom (right side) of the array, the incoming message
(0:1) moved to the top (left side), and the two messages with the same
destination (1:2 and 1:3) merged into one (1:5).
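
The behavior described above can be reproduced in software. The
following Python sketch applies the ADD/SWAP rules to adjacent register
pairs on alternating phases until the array settles. The starting order
of the four messages is a hypothetical one chosen for illustration, and
the function names are not part of the design.

```python
def compare(a, b):
    """Apply the ADD/SWAP rules to an (upper, lower) register pair.

    Each register is an (address, data) tuple; (0, 0) is an empty slot.
    Returns the new (upper, lower) pair.
    """
    (a_addr, a_data), (b_addr, b_data) = a, b
    if a_addr == b_addr:                      # ADD = 1
        if a_addr == 0:                       # SWAP = 1: sum filters up
            return (0, a_data + b_data), (0, 0)
        return (0, 0), (b_addr, a_data + b_data)  # SWAP = 0: merge downward
    if a_addr > b_addr:                       # ADD = 0, SWAP = 1
        return b, a
    return a, b                               # ADD = 0, SWAP = 0


def phase(regs, start):
    """Compare the pairs (start, start+1), (start+2, start+3), ..."""
    regs = list(regs)
    for i in range(start, len(regs) - 1, 2):
        regs[i], regs[i + 1] = compare(regs[i], regs[i + 1])
    return regs


def settle(regs):
    """Alternate the two phases until a full pair of phases changes nothing."""
    regs = list(regs)
    while True:
        nxt = phase(phase(regs, 0), 1)
        if nxt == regs:
            return regs
        regs = nxt


# Top of the array first; written as address:data in the text.
start = [(5, 4), (1, 2), (1, 3), (0, 1)]
print(settle(start))  # [(0, 1), (0, 0), (1, 5), (5, 4)]
```

The merge leaves an empty (0:0) slot behind, which then sorts toward the
top along with the incoming message because its address is zero.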

The layouts for the compare/swap cell and the add/swap cell are
shown in Figures 4 and 5.

2.2. The Update Section

The function of the update section is to collect the
contributions arriving from the communications section and update the
activation level register. Current work in activation/inhibition
networks uses thresholds to turn cells on and off, decay
functions to avoid over-activation, and various other mechanisms.
Before this cell could be fabricated, a real update section would
have to be designed and built, most likely with a microprogrammed
ALU or several operation specification bits.

2.3. The Communication Section

The last section of each half of the activation cell is the
communications area. There is a sorting memory which holds, sorts
and merges messages. Since each message contains a relative
address, the address is decremented before the message is sent.
Since the overlapped processes of message passing and merging never
stop, empty (0:0) messages may get sent. For this reason, the
decrementer must not decrement a zero, for that would cause a long,
useless message to propagate.

The decrementer is a bit-slice design with a ripple carry and a
distributed NOR gate for detecting the zero condition. The layout of
a single bit decrementer cell is shown in Figure 6.
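
The zero-guarded decrement can be modelled bit by bit. The sketch below
is only a behavioral illustration in Python, not a description of the
actual layout; the function name and the little-endian bit list are
assumptions. The zero test stands in for the distributed NOR gate, and
the loop stands in for the ripple of the borrow through the slices.

```python
def decrement_address(addr, width):
    """Decrement a width-bit relative address, leaving zero unchanged."""
    bits = [(addr >> k) & 1 for k in range(width)]  # least significant first
    is_zero = not any(bits)        # NOR over all address bits
    borrow = 0 if is_zero else 1   # an empty message is not decremented
    out = 0
    for k, b in enumerate(bits):
        out |= (b ^ borrow) << k   # each slice flips its bit while borrowing
        borrow &= 1 - b            # the borrow ripples on only through 0 bits
    return out


print([decrement_address(a, 4) for a in (0, 1, 4, 15)])  # [0, 0, 3, 14]
```

Without the guard, an address of 0 in a 4-bit field would wrap around to
15, producing exactly the long, useless message the text warns about.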

3. Further Work

More work needs to be done in order to see if this message
merging process can be extended to a higher level of physical
connectivity than two, and on the design and layout of a programmable
activation level update function. Also, the load/input section needs
a filter to remove links with zero weight.

4. Conclusion

Activation/inhibition networks are important techniques for
research in cognitive simulation. While it is easy to simulate these
networks on serial machines, serious studies of large networks can
cause much thrashing. Given the dropping cost of building hardware
and the need to find processing techniques which take advantage of
the massive concurrency potential of SHSI[11], the union of these
two technologies may be quite profitable. With that in mind, I have
demonstrated how parallel simulation of activation/inhibition networks
can be done, and done efficiently using a simple, regular structure
in VLSI.


[11] SHSI = Super Humongous Scale Integration.